16 March, 2016
Nonspatial data has no location information
nonspatial = data.frame( id=c(1,2,3,4), data=rnorm(4) ) print(nonspatial)
## id data ## 1 1 -0.1140578 ## 2 2 1.0914729 ## 3 3 -0.4780712 ## 4 4 0.1558649
Spatial data has location information
The simplest spatial data are points on a map
spatial = data.frame( id=c(1,2,3,4), data=rnorm(4), x=runif(4,-180,180), y=runif(4,-90,90) ) print(spatial)
## id data x y ## 1 1 -1.06768167 8.198913 45.85735 ## 2 2 -0.36974434 103.754462 -84.13661 ## 3 3 0.03500162 -124.050690 39.74547 ## 4 4 0.41151930 43.201727 -40.59028
Which we can convert to explicitly spatial data using the sp package. Most GIS packages in R store data as sp classes.
library(sp)
The sp package has a method called coordinates that converts points to an sp class.
coordinates(spatial) = ~ x + y class(spatial)
## [1] "SpatialPointsDataFrame" ## attr(,"package") ## [1] "sp"
plot(spatial, axes=T)
Spatial data also needs a reference system or "projection" that allows us to represent spatial features on a map. Projections can be thought of as simply a coordinate system with an origin that is relative to a known point in space.
This is a whole field of mathematically intensive study termed "geodesy"
Much of the field of geodesy is jam-packed in the rgdal package, which is a wrapper for the Geospatial Data Abstration Library
library(rgdal)
rgdal includes a comprehensive list of projections that are typically represented as a string of parameters.
The most common is our standard latitude/longitude system, where the coordinates are angular and the origin is the equator directly south of Greenwich, England. The simplest projection string to denote this projections is:
"+proj=longlat"
To define the projection for spatial, we write to its proj4string slot:
proj4string(spatial) = "+proj=longlat"
Projections are a necessary evil for GIS users (to be continued)
With a projection associated with our spatial data, we can now relate it to other spatial data. In other words, let's make a map!
library(leaflet)
m = leaflet(data=spatial) %>% addTiles() %>% addMarkers() m
Vector = Polygons
Raster = Grid
Vector = Discrete
Raster = Continuous
Vector = Illustrator/Inkscape
Raster = Photoshop/GIMP
library(rgdal)
soils = readOGR(dsn="data", layer="soilsData")
## OGR data source with driver: ESRI Shapefile ## Source: "data", layer: "soilsData" ## with 75 features ## It has 27 fields
writeOGR( soils, "data", "soilsData_out", driver="ESRI Shapefile" )
class(soils)
## [1] "SpatialPolygonsDataFrame" ## attr(,"package") ## [1] "sp"
slotNames(soils)
## [1] "data" "polygons" "plotOrder" "bbox" "proj4string"
Take a peak here at the top of our attribute table
str(soils@data[,1:5])
## 'data.frame': 75 obs. of 5 variables: ## $ mukey : Factor w/ 25 levels "2774742","2774772",..: 12 19 12 6 9 5 7 24 25 13 ... ## $ muarcrs: Factor w/ 75 levels "0.40538405","0.90105194",..: 25 62 70 19 3 29 20 7 5 27 ... ## $ Sand1 : num 21.5 11 21.5 12 29.5 ... ## $ Sand2 : num 35.29 8.34 35.29 9.02 38.22 ... ## $ Sand3 : num 45.09 7.56 45.09 16.59 64.52 ...
*Make sure not to call head() or str() on a spatial object, as this spits out the first six features and all their attributes
str(soils@data[,1:10])
## 'data.frame': 75 obs. of 10 variables: ## $ mukey : Factor w/ 25 levels "2774742","2774772",..: 12 19 12 6 9 5 7 24 25 13 ... ## $ muarcrs: Factor w/ 75 levels "0.40538405","0.90105194",..: 25 62 70 19 3 29 20 7 5 27 ... ## $ Sand1 : num 21.5 11 21.5 12 29.5 ... ## $ Sand2 : num 35.29 8.34 35.29 9.02 38.22 ... ## $ Sand3 : num 45.09 7.56 45.09 16.59 64.52 ... ## $ Sand4 : num 48.6 30.6 48.6 36.7 33.2 ... ## $ Sand5 : num 0 31.1 0 27.1 57.2 ... ## $ Silt1 : num 45.8 65.2 45.8 68.7 54.5 ... ## $ Silt2 : num 37.3 64.9 37.3 60.3 43.5 ... ## $ Silt3 : num 36.7 64.1 36.7 34 21.1 ...
str(soils@polygons[1])
## List of 1 ## $ :Formal class 'Polygons' [package "sp"] with 5 slots ## .. ..@ Polygons :List of 1 ## .. .. ..$ :Formal class 'Polygon' [package "sp"] with 5 slots ## .. .. .. .. ..@ labpt : num [1:2] -90.1 43.1 ## .. .. .. .. ..@ area : num 1.19e-06 ## .. .. .. .. ..@ hole : logi FALSE ## .. .. .. .. ..@ ringDir: int 1 ## .. .. .. .. ..@ coords : num [1:21, 1:2] -90.1 -90.1 -90.1 -90.1 -90.1 ... ## .. ..@ plotOrder: int 1 ## .. ..@ labpt : num [1:2] -90.1 43.1 ## .. ..@ ID : chr "0" ## .. ..@ area : num 1.19e-06
Number of features
length(soils)
## [1] 75
Index the first feature with a slice
poly_1 = soils[1,]
plot( soils, main="Soils from Western WI", col=rainbow(5) )
A raster grid is rectangular.
Grid is another word for matrix.
Grid is another word for image.
A GIS raster grid is a matrix/image with an associated location and projection.
At a minimum, a GIS raster grid contains:
The rgdal rgdal packages is primarily for I/O and projecting GIS data
library(rgdal)
The raster package does everything rgdal does, but it includes lots of additional functionality.
library(raster)
elev = readGDAL("data/dem_wi.tif")
writeGDAL(elev, "data/dem_wi_out.tif")
elev = raster("data/dem_wi.tif")
writeRaster(elev, "data/dem_wi_out.tif")
The raster object elev has all the necessary pieces of spatial information:
elev
## class : RasterLayer ## dimensions : 284, 387, 109908 (nrow, ncol, ncell) ## resolution : 0.01666667, 0.01666667 (x, y) ## extent : -93.03262, -86.58262, 42.3949, 47.12823 (xmin, xmax, ymin, ymax) ## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 ## data source : C:\Users\ruesca\Documents\GeRgraphyPresentation\data\dem_wi.tif ## names : dem_wi ## values : 175, 565.4104 (min, max)
Which means we can make a map!
m = leaflet() %>% addTiles() %>% addRasterImage(elev, opacity=0.5) m
Remember that rasters are just matrices!
Therefore, most matrix operations can be applied to rasters. For example:
plot(
elev > 400,
col=c("red", "blue")
)
Rasters can be easily converted to matrices to do more complex work.
lat_grad = apply( as.matrix(elev), 1, mean, na.rm=T ) plot(lat_grad, type="l")
Most raster analysis ultimately executes some sort of overlay.
The issue:
To overlay two or more rasters, their projections, extents, and cellsizes must align perfectly.
This can be a difficult task.
What is the highest point in each county?
# Pseudo-code 1. Read in elevation data (raster grid) 2. Read in county boundary data (polygons) 3. Convert counties to raster grid that aligns with elevation grid 4. Find maximum elevation gridcell within each county
counties = readOGR("data", "WI_Counties")
## OGR data source with driver: ESRI Shapefile ## Source: "data", layer: "WI_Counties" ## with 72 features ## It has 7 fields
elev
## class : RasterLayer ## dimensions : 284, 387, 109908 (nrow, ncol, ncell) ## resolution : 0.01666667, 0.01666667 (x, y) ## extent : -93.03262, -86.58262, 42.3949, 47.12823 (xmin, xmax, ymin, ymax) ## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 ## data source : C:\Users\ruesca\Documents\GeRgraphyPresentation\data\dem_wi.tif ## names : dem_wi ## values : 175, 565.4104 (min, max)
proj4string(counties)
## [1] "+proj=tmerc +lat_0=0 +lon_0=-90 +k=0.9996 +x_0=520000 +y_0=-4480000 +ellps=GRS80 +units=m +no_defs"
proj4string(elev)
## [1] "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
extent(counties)
## class : Extent ## xmin : 294839 ## xmax : 770036.4 ## ymin : 225108.8 ## ymax : 734398.4
extent(elev)
## class : Extent ## xmin : -93.03262 ## xmax : -86.58262 ## ymin : 42.3949 ## ymax : 47.12823
cty_grid = rasterize(counties, elev, field="COUNTY_FIP") summary(cty_grid)
## layer ## Min. NA ## 1st Qu. NA ## Median NA ## 3rd Qu. NA ## Max. NA ## NA's 109908
prj = proj4string(elev) cty_prj = spTransform(counties, prj)
extent(cty_prj)
## class : Extent ## xmin : -92.88924 ## xmax : -86.8048 ## ymin : 42.49197 ## ymax : 47.08077
extent(elev)
## class : Extent ## xmin : -93.03262 ## xmax : -86.58262 ## ymin : 42.3949 ## ymax : 47.12823
plot(elev) plot(cty_prj, add=TRUE)
cty_grid = rasterize(cty_prj, elev, field="COUNTY_FIP") summary(cty_grid)
## layer ## Min. 1 ## 1st Qu. 18 ## Median 36 ## 3rd Qu. 55 ## Max. 72 ## NA's 50557
extent(cty_grid)
## class : Extent ## xmin : -93.03262 ## xmax : -86.58262 ## ymin : 42.3949 ## ymax : 47.12823
extent(elev)
## class : Extent ## xmin : -93.03262 ## xmax : -86.58262 ## ymin : 42.3949 ## ymax : 47.12823
library(dplyr)
ovly = data.frame(
elev=getValues(elev),
cty=getValues(cty_grid)
)
hi_pt = ovly %>%
group_by(cty) %>%
mutate(
elev = (elev == max(elev, na.rm=T)) * elev
) %>%
ungroup()
elev = setValues(elev, hi_pt[["elev"]])
elev[elev == 0] = NA
hi_pt_sp = rasterToPoints(elev, spatial=T)